Types of Semantic Information Necessary in a Machine Translation Lexicon
Abstract
This paper describes research assessing which types of semantic information (SI) a Machine Translation (MT) lexicon needs in order to attain ‘good’ translation quality. We present a typology of semantic information that allows the use of semantics in any MT system to be quantified in precise and absolute, rather than relative, terms. This typology was used to survey the SI present in twenty commercial and research MT systems. An automatically translated corpus was analysed to identify which types of semantics were necessary to achieve high-quality translation. The survey and the analysis allowed us to conclude that four of the nine types of SI identified should always be included and that a further two complex SI types should be considered for inclusion pending further analysis. A formal lexicon specification incorporating these six SI types is presented.
Similar resources
A Comparative Study of English-Persian Translation of Neural Google Translation
Many studies abroad have focused on neural machine translation and almost all concluded that this method was much closer to humanistic translation than machine translation. Therefore, this paper aimed at investigating whether neural machine translation was more acceptable in English-Persian translation in comparison with machine translation. Hence, two types of text were chosen to be translated...
Boosting Lexical Resources for the Semantic Web: Generative Lexicon and Lexicon Interoperability
Computational lexicons can play a key role in the Semantic Web: aiming at making word content machine-understandable, they intend to provide an explicit representation of word meaning, so that it can be directly accessed and used by computational agents, such as large-coverage parsers, modules for intelligent Information Retrieval or Information Extraction. In all these cases, semantic informat...
Bilingual FrameNet Dictionaries for Machine Translation
This paper describes issues surrounding the planning and design of German FrameNet (GFN), a counterpart to the English-based FrameNet project. The goals of GFN are (a) to create lexical entries for German nouns, verbs, and adjectives that correspond to existing FrameNet entries, and (b) to link the parallel lexicon fragments by means of common semantic frames and numerical indexing mechanisms. G...
A Comparison of Various Types of Extended Lexicon Models for Statistical Machine Translation
In this work we give a detailed comparison of the impact of the integration of discriminative and trigger-based lexicon models in state-of-the-art hierarchical and conventional phrase-based statistical machine translation systems. As both types of extended lexicon models can grow very large, we apply certain restrictions to discard some of the less useful information. We show how these restrictio...
Automatic Semantic Role Labeling in Persian Sentences Using Dependency Trees
Automatic identification of words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching correct semantic roles to them may lead to improvement in many natural language processing tasks, including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...
Publication date: 1999